(57) ABSTRACT

The present invention is a method and apparatus for automatic
image segmentation using template matching filters.
The invention generally segments differing binary textures 
or structures within an input image by passing one or more 
structures while removing other structures. More 
particularly, the method and apparatus segment a stored 
binary image using a template matching filter that is 
designed to pass therethrough, for example, text regions 
while removing halftone regions. 

7 Claims, 7 Drawing Sheets

METHOD AND APPARATUS FOR
AUTOMATIC IMAGE SEGMENTATION
USING TEMPLATE MATCHING FILTERS

This invention relates generally to a method and apparatus
for automatic image segmentation using template matching
filters, and more particularly to a method and apparatus for
segmenting regions of differing texture or structure within a
stored binary image using a template matching filter that is
designed to pass at least one texture while removing one or
more other textures.

CROSS REFERENCE 

The following related applications are hereby incorpo- 
rated by reference for their teachings: 

U.S. patent Ser. No. 08/004,479 by Shiau (published as
EP-A2 0 521 662 on Jan. 7, 1993), now U.S. Pat. No.
5,293,430;

"Method for Design and Implementation of an Image
Resolution Enhancement System That Employs Statistically
Generated Look-Up Tables," Loce et al., Ser. No.
08/169,485, filed Dec. 17, 1993, now U.S. Pat. No.
5,696,845;

"Non-Integer Image Resolution Conversion Using
Statistically Generated Look-Up Tables," Loce et al., Ser.
No. 08/170,082, filed Dec. 17, 1993, now U.S. Pat. No.
5,387,985;

"Method for Statistical Generation of Density Preserving
Templates for Print Enhancement," Loce et al., Ser. No.
08/169,565, filed Dec. 17, 1993, now U.S. Pat. No.
5,359,423;

"Automated Template Design for Print Enhancement,"
Eschbach, Ser. No. 08/169,483, filed Dec. 17, 1993,
now U.S. Pat. No. 5,724,455; and

"Image Resolution Conversion Method that Employs
Statistically Generated Multiple Morphological
Filters," Loce et al., Ser. No. 08/169,487, filed Dec. 17,
1993, now U.S. Pat. No. 5,579,445.

INCORPORATION BY REFERENCE

U.S. Pat. No. 4,194,221 to Stoffel, U.S. Pat. No. 4,811,115
to Lin et al., and U.S. Pat. No. 5,131,049 to Bloomberg et al.
are hereby specifically incorporated by reference for their
teachings regarding image segmentation.

BACKGROUND AND SUMMARY OF THE 
INVENTION 

The present invention is a novel approach to separating
text, halftones, or other image structures in composite
images using template-based filtering methods. A key
application of the present invention is the segmentation of
text regions from halftone regions. In the reproduction of an
original document from video image data created, for
example, by electronic raster input scanning from an original
document, one is faced with the limited resolution
capabilities of the reproducing system and the fact that
output devices remain predominantly binary. This is
particularly evident when attempting to reproduce halftones,
lines and continuous tone images. Of course, an image data
processing system may be tailored so as to offset the limited
resolution capabilities of the reproducing apparatus used,
but this is difficult due to the divergent processing needs
required by the different image types that may be
encountered. In this respect, it should be understood that the
image content of the original document may consist entirely
of high frequency halftones, low frequency halftones,
continuous tones, text or line copy, or a combination, in some
unknown degree, of some or all of the above. Optimizing the
image processing system for one image type in an effort to
offset the limitations in the resolution capability of the
reproducing apparatus used may not be possible, requiring
a compromise choice that may not produce acceptable
results. Thus, for example, where one optimizes the system
for low frequency halftones, it is often at the expense of
degraded reproduction of high frequency halftones, or of
text or line copy, and vice versa. Beyond the issue of
accurate reproduction, segmentation of different image types
is key to the successful application of recognition algorithms
(e.g., character recognition and glyph recognition) and
efficient application of image compression techniques.

As one example of the problems encountered, reproduction
of halftoned images with screening tends to introduce
moire, caused by the interaction of the original screen
frequency and applied screen frequency. Although the use of
high frequency fine screens can reduce the problem, the
artifact can still occur in some images. In a networked
environment particularly, it is desirable that the image
processing device (e.g., raster input scanner) detect the
halftone, and low-pass filter the document image into a
continuous tone for subsequent halftone reproduction by
printers in the network in accordance with their particular
capabilities.

Heretofore, a number of applications, patents and
publications have disclosed techniques for segmentation of
digital image data, the relevant portions of which may be
briefly summarized as follows:

U.S. patent application Ser. No. 08/044,479 to Shiau
teaches a particular problem noted in the use of an auto
correlation function: the false characterization of a portion
of the image as a halftone, when in fact it would be
preferable for the image to be processed as a line image.
Examples of this defect are noted particularly in the
processing of Japanese Kanji characters and small Roman
letters. In these examples, the auto correlation function may
detect the image as halftones and process accordingly,
instead of applying a common threshold through the
character image. The described computations of auto
correlation are one dimensional in nature, and this problem of
false detection will occur whenever a fine pattern that is
periodic in the scan line or fast scan direction is detected. In
the same vein, shadow areas and highlight areas are often not
detected as halftones, and are then processed with the
application of a uniform threshold.
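
The false-detection problem can be illustrated with a short sketch. The following Python fragment is not taken from the Shiau application; it assumes a plain one-dimensional autocorrelation over a single binary scan line, and the function names `autocorrelation` and `peak_lag` as well as both sample rows are hypothetical. A halftone row and a row crossing fine, evenly spaced character strokes both show their strongest peak at the same lag, so a detector that looks only for periodic peaks along the scan line cannot tell them apart.

```python
def autocorrelation(scanline, max_lag):
    """Unnormalized 1-D autocorrelation of a binary scan line for lags 0..max_lag."""
    n = len(scanline)
    return [
        sum(scanline[i] * scanline[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def peak_lag(acf):
    """Return the nonzero lag with the largest autocorrelation value."""
    return max(range(1, len(acf)), key=lambda lag: acf[lag])


# Hypothetical data: a halftone row that repeats every 4 pixels, and a row that
# crosses fine vertical strokes (small characters) also spaced 4 pixels apart.
halftone_row = [1, 1, 0, 0] * 8
stroke_row = [1, 0, 0, 0] * 8

halftone_acf = autocorrelation(halftone_row, 8)
stroke_acf = autocorrelation(stroke_row, 8)

# Both rows report the same dominant period along the scan line.
print(peak_lag(halftone_acf), peak_lag(stroke_acf))
```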

U.S. Pat. No. 4,194,221 to Stoffel, issued Mar. 18, 1980,
discloses the problem of image segmentation. The problem
was addressed by applying a discrimination function
instructing the image processing system as to the type of
image data present and particularly, an auto correlation
function to the stream of pixel data, to determine the
existence of halftone image data. Stoffel describes a method
of processing automatically a stream of image pixels
representing unknown combinations of high and low frequency
halftones, continuous tones, and/or lines to provide binary
level output pixels representative of the image. The
described function is applied to the stream of image pixels
and, for the portions of the stream that contained high
frequency halftone image data, notes a large number of
closely spaced peaks in the resultant signal. The correlator
circuits described in Stoffel's embodiment, however, are
very expensive, as they must provide a digital multiplication
function. Accordingly, as a practical matter, Stoffel requires
as a first step, reduction of the amount of data handled, by
initially thresholding image data against a single threshold
value, to reduce the image to a high contrast black or white
image. However, depending on the selection of the threshold
as compared to the intensity of the image, significant
amounts of information may be lost in the thresholding
process. For example, if the threshold level is set to
distinguish in the middle of the intensity range, but the
image has significant variations through the darker gray
levels, the thresholded result does not indicate the
variations. This results in an undesirable loss of image
information. While it may be possible to vary the threshold
value adaptively from original to original and from image
area to image area, such algorithms tend to be complicated
and work well only for a restricted class of images such as
line images.

U.S. Pat. No. 4,811,115 to Lin et al., issued Mar. 7, 1989,
teaches an auto correlation function that is calculated for the
stream of halftone image data at selected time delays that are
predicted to be indicative of the image frequency
characteristics, without prior thresholding. The arithmetic
function used in that auto correlation system is an
approximation of the auto correlation function that employs
logical functions and addition, rather than the multiplication
function used in U.S. Pat. No. 4,194,221 to Stoffel. Valleys in
the resulting auto correlated function are detected to
determine whether high frequency halftone image data is
present.

U.S. Pat. No. 5,065,437 to Bloomberg, issued Nov. 12,
1991, discloses a method for separating finely textured and
solid regions in a binary image. Initially an operation is
carried out on the image to thicken text and lines and to
solidify textured regions. The image is then subjected to a
second set of operations that eliminates ON pixels that are
near OFF pixels, thereby thinning out and eliminating the
previously thickened text and lines, but leaving the
previously solidified textured regions.

U.S. Pat. No. 5,131,049 to Bloomberg, issued Jul. 14,
1992, discloses a method for creating a mask for separating
halftone regions in a binary image from other regions. The
method includes constructing a seed image, constructing a
clipping mask, and filling the seed while clipping to the
mask.

U.S. Pat. No. 5,341,226 to Shiau, issued Aug. 23, 1994,
discloses a method and apparatus for processing color
document images to determine the presence of particular
image types in order to designate areas for optimal image
processing thereof. A multi-separation image defined in
terms of color density for each separation is converted to a
luminance-chrominance definition, where one component of
the image represents image intensity. An image segmentation
process operates on the image intensity signal, the
results of which are used to determine processing of the
multi-separation image.

UK-A-2,153,619, published August 1985, teaches a similar
determination of the type of image data. However in that
case, a threshold is applied to the image data at a certain
level, and subsequent to thresholding the number of
transitions from light to dark within a small area is counted.
The system operates on the presumption that data with a low
number of transitions after thresholding is probably a high
frequency halftone or continuous tone image. The
thresholding step in this method has the same undesirable
effect as described for Stoffel.

Robert P. Loce et al. in Facilitation of Optimal Binary
Morphological Filter Design via Structuring Element
Libraries and Design Constraints, Optical Engineering, Vol.
31, No. 5, May 1992, pp. 1008-1025, incorporated herein by
reference, describe three approaches to reducing the
computational burden associated with digital morphological
filter design. Although the resulting filter is suboptimal,
imposition of the constraints in a suitable manner results in
little loss of performance in return for design tractability.

Mathematical Morphology in Image Processing, pp.
43-90 (Edward R. Dougherty ed., Marcel Dekker 1992),
hereby incorporated by reference, describes efficient design
strategies for the optimal binary digital morphological filter.
A suboptimal design methodology is investigated for binary
filters in order to facilitate a computationally manageable
design process.

Robert P. Loce et al., in Optimal Morphological
Restoration: The Morphological Filter Mean-Absolute-Error
Theorem, Journal of Visual Communications and Image
Representation, (Academic Press), Vol. 3, No. 4, December
1992, pp. 412-432, hereby incorporated by reference, teach
expressions for the mean-absolute restoration error of
general morphological filters formed from erosion bases in
terms of mean-absolute errors of single-erosion filters. In the
binary setting, the expansion is a union of erosions, while in
the gray-scale setting the expansion is a maxima of erosions.
Expressing the mean-absolute-error theorem in a recursive
form leads to a unified methodology for the design of
optimal (suboptimal) morphological restoration filters.
Applications to binary-image, gray-scale signal, and
order-statistic restoration on images are included.

Edward R. Dougherty et al., in Optimal mean-absolute-
error hit-or-miss filters: morphological representation and
estimation of the binary conditional expectation, Optical
Engineering, Vol. 32, No. 4, April 1993, pp. 815-827,
incorporated herein by reference, disclose the use of a
hit-or-miss operator as a building block for optimal binary
restoration filters. Filter design methodologies are given for
general-, maximum-, and minimum-noise environments and
for iterative filters.

Robert P. Loce, in Morphological Filter Mean-Absolute-
Error Representation Theorems and Their Application to
Optimal Morphological Filter Design, Center for Imaging
Science, Rochester Institute of Technology, (Ph.D. Thesis),
1993, incorporated herein by reference, discloses
design methodologies for optimal mean-absolute-error
(MAE) morphological based filters.

In accordance with the present invention, there is provided
a method performed in a digital processor for processing
a document image to determine image types present
therein, the steps comprising:

receiving, from an image source, a document image
having a plurality of pixels therein, each pixel represented
by a density signal, and storing at least a portion
thereof representing a region of the document image in
a data buffer;

retrieving, from the data buffer, the density signals for the
document image;

determining, using template matching filters, image types
present in the region of the document image.

In accordance with another aspect of the present
invention, there is provided an apparatus for processing
binary image pixels in an image represented by a plurality of
rasters of pixels, to preferentially pass regions having a first
structure therethrough so as to produce an output image
primarily comprised of regions exhibiting the first structure,
including:

an image memory for storing the binary image signals;

a window buffer for storing a plurality of image signals
from a plurality of rasters, said image signals representing
pixels centered about a target pixel;
a template filter to generate an output image signal as a
function of the image signals stored in the window
buffer, wherein the output signal is equivalent to the
image signal for regions of the binary image where the
target pixel represents the first structure, and where the
output signal is zero for regions of the binary image
where the target pixel represents another structure; and

an output memory for storing the output signal for each of
a plurality of target pixels, wherein the signals stored in
each location of said output memory are generated by
said template filter as a function of the image signals
within a window whose contents are determined as a
function of the corresponding target pixel location.

In accordance with yet another aspect of the present
invention, there is provided an apparatus for processing
binary image pixels in an image represented by a plurality of
rasters of pixels, to identify regions exhibiting a particular
structure therein, comprising:

an image source for producing a document image having
a plurality of pixels therein, each pixel represented by
a density signal;

memory for storing at least a portion of the density signals
representing a region of the document image in a data
buffer; and

a segmentation circuit employing template-matching
filters to identify the presence of the particular structure
in the region of the image stored in said memory.

One aspect of the invention is based on the discovery that
templates may be employed to recognize one binary
structure within one or more textures. More specifically,
template-based filters may be used to recognize regions of
an image that contain text and line art. This discovery further
avoids problems that arise in techniques that attempt to
cover a broad range of document types, as the present
invention further enables the "customization" of the
template-based filters used therein in response to training
documents that are representative of documents commonly
encountered by the image processing system. This aspect is
further based on the discovery of techniques that generate
statistical representations of the patterns found in text and
halftone regions of documents as further described, for
example, by Eschbach in U.S. application Ser. No.
08/169,483 and Loce et al. in U.S. application Ser. No.
08/169,485.

The technique described herein is advantageous because
it is inexpensive compared to other approaches and is
flexible, in that it can be adapted to any of a number of input
document types exhibiting a wide range of possible patterns.
As a result of the invention, a low-cost image segmentation
system may be accomplished.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a general block diagram showing an embodiment
of the automatic image segmentation apparatus in
accordance with the present invention, where the invention
is employed in a document reproduction system;

FIG. 2 is a data flow diagram illustrating a two-stage
image segmentation process in accordance with the present
invention;

FIG. 3 is a flowchart illustrating the various steps in a
serial process used to apply a template-based segmentation
filter to the input image of FIG. 1;

FIGS. 4A and 4B are pictorial representations of the
operation of the template-based segmentation filter on a
region of an image in accordance with the present invention;

FIG. 5 is a simplified electrical schematic illustrating the
implementation of a parallel processing scheme for a
template-based segmentation circuit in accordance with one
aspect of the present invention; and

FIGS. 6 and 7 are data flow diagrams illustrating the
various stages in the process used to design the
template-based segmentation filters of the present invention.

The present invention will be described in connection
with a preferred embodiment, however, it will be understood
that there is no intent to limit the invention to the
embodiment described. On the contrary, the intent is to cover
all alternatives, modifications, and equivalents as may be
included within the spirit and scope of the invention as
defined by the appended claims.

DESCRIPTION OF THE PREFERRED
EMBODIMENT

For a general understanding of the present invention,
reference is made to the drawings. In the drawings, like
reference numerals have been used throughout to designate
identical elements. In describing the present invention, the
following term(s) have been used in the description.

The term "data" refers herein to physical signals that
indicate or include information. When an item of data can
indicate one of a number of possible alternatives, the item of
data has one of a number of "values." For example, a binary
item of data, also referred to as a "bit," has one of two
values, interchangeably referred to as "1" and "0" or "ON"
and "OFF" or "high" and "low." A bit is an "inverse" of
another bit if the two bits have different values. An N-bit
item of data has one of a possible 2^N values. The term "data"
includes data existing in any physical form, and includes
data that are transitory or are being stored or transmitted. For
example, data could exist as electromagnetic or other
transmitted signals or as signals stored in electronic,
magnetic, or other form.

"Circuitry" or a "circuit" is any physical arrangement of
matter that can respond to a first signal at one location or
time by providing a second signal at another location or
time.

A "data storage medium" or "storage medium" is a
physical medium that can store data. Examples of data
storage media include magnetic media such as diskettes,
floppy disks, and tape; optical media such as laser disks and
CD-ROMs; and semiconductor media such as semiconductor
ROMs and RAMs. As used herein, "storage medium"
covers one or more distinct units of a medium that together
store a body of data. For example, a set of floppy disks
storing a single body of data would together be a storage
medium. "Memory circuitry", "memory", or "register" is
any circuitry that can store data, and may include local and
remote memory and input/output devices.

A "data processing system" is a physical system that
processes data. A "data processor" or "processor" is any
component or system that can process data, and may include
one or more central processing units or other processing
components.

A processor or other component of circuitry "uses" an
item of data in performing an operation when the result of
the operation depends on the value of the item. For example,
the operation could perform a logic or arithmetic operation
on the item or could use the item to access another item of
data.

An "image" is generally a pattern of physical light. An
image may include characters, words, and text as well as
other features such as graphics. Text may be included in a set
of one or more images. An image may be divided or
segmented into "segments" or "regions," each of which is
itself an image. The "structure" of an image segment or
region is generally determined by the primary content of the
region including, for example, text, halftone or graphics
structures. A segment of an image may be of any size up to
and including the whole image. An image may also refer to
a two-dimensional data array that represents a pattern of
physical light. A "document," which may exist in either
hardcopy (written or printed) or electrical (data array) form,
is a representation of one or more images and/or text. A
document may include multiple pages.

An item of data "defines" an image when the item of data 
includes sufficient information to produce the image. For
example, a two-dimensional array can define all or any part 
of an image, with each item of data in the array providing a 
value indicating the color and/or intensity of a respective 
location of the image. If a two-dimensional array or other 
item of data defines an image that includes a character, the 
array or other data also defines the character. 

Each location in an image may be called a "pixel." A 
"pixel" is the smallest segment into which an image is 
divided in a given system. In an array defining an image in 
which each item of data provides a value, each value 
indicating the color and/or intensity of a location may be 
called a "pixel value". Each pixel value in a binary image is 
an electrical signal in a "binary form", a gray-scale value in 
a "gray-scale form" of an image, or a set of color space 
coordinates in a "color coordinate form" of an image, the 
binary form, gray-scale form, and color coordinate form 
each being a two-dimensional array defining an image. 
Hence, the term pixel may also refer to the electrical (or 
optical) signal representing the measurable optical proper- 
ties of a physically definable region on a display medium. A 
plurality of physically definable regions for either situation 
represents the measurable properties of the entire image to 
be rendered by either a material marking device, electrical or 
magnetic marking device, or optical display device. Lastly, 
the term pixel may refer to an electrical (or optical) signal
representing physical optical property data generated from a
photosensitive element when scanning a physical image, so 
as to convert the physical optical properties of the image to 
an electronic or electrical representation. In other words, in 
this situation, a pixel is an electrical (or optical) represen- 
tation of the optical properties of an image measured at a 
definable area by an optical sensor. 

An item of data "relates to" part of an image, such as a 
pixel or a larger segment of the image, when the item of data 
has a relationship of any kind to the part of the image. For 
example, the item of data could define the part of the image, 
as a pixel value defines a pixel; the item of data could be 
obtained from data defining the part of the image; the item 
of data could indicate a location of the part of the image; or 
the item of data could be part of a data array such that, when 
the data array is mapped onto the image, the item of data 
maps onto the part of the image. An operation performs 
"image processing" when it operates on an item of data that 
relates to part of an image. 

Pixels are "neighbors" or "neighboring" within an image
when there are no other pixels between them or when they 
meet an appropriate criterion for neighboring, such as falling 
within a positioned observation window. If the pixels are 
rectangular and appear in rows and columns, each pixel may 
have 4 or 8 connected neighboring pixels, depending on the 
criterion used. 
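
As a minimal sketch of the two neighboring criteria just mentioned, the following Python fragment lists the 4-connected and 8-connected neighbors of a pixel on a rectangular grid. The names `N4`, `N8`, and `neighbors` are hypothetical and are used only for illustration; the clipping at the image border is an assumption of this sketch.

```python
# Row/column offsets for the two common neighboring criteria.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def neighbors(row, col, height, width, offsets):
    """Return the in-bounds neighbors of (row, col) under the given criterion."""
    return [
        (row + dr, col + dc)
        for dr, dc in offsets
        if 0 <= row + dr < height and 0 <= col + dc < width
    ]


# An interior pixel of a 3 x 3 image has 4 or 8 neighbors; a corner has fewer.
interior_4 = neighbors(1, 1, 3, 3, N4)
interior_8 = neighbors(1, 1, 3, 3, N8)
corner_8 = neighbors(0, 0, 3, 3, N8)
```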

An "image input device" is a device that can receive an
image and provide an item of data defining a representation
of the image. A "scanner" is an image input device that
receives an image by a scanning operation, such as by
scanning a hardcopy document. For example, the Xerox 7650
Pro Imager scanner and Xerox 7009 Facsimile Terminal are
devices which receive hardcopy documents to produce data
defining an image. Other image input devices include data
processing systems suitable for generating digital
documents in response to instructions from a user.

An "image output device" (IOT) is a device that can
receive an item of data defining an image and provide the
image as output. A "display" is an image output device that
provides the output image in human viewable form. The
visible pattern presented by a display is a "displayed image"
or simply "image." A "printer" or "marking engine" is an
image output device capable of rendering the output image
in human readable form on a removable medium.

Turning now to FIG. 1, which shows an embodiment of
the automatic image segmentation apparatus, employed in
an image reprographic setting in accordance with the present
invention, the general components of digital printer 12 are
depicted. More specifically, an input image 10 would be
presented to digital printer 12 to produce a printed output 20.
Within digital printer 12 a segmentation filter 14 transforms
the input image in accordance with the present invention
into at least two segmented images, in a simplified case, to
segment text and halftone regions thereof. The segmented
image bitmaps are, in turn, passed to an image
processing/recombination circuit 15. As will be further
described, image processing circuit 15 processes the
segmented images to produce an output image 16 that is
optimized for the given marking process. Alternatively, while
in segmented form, the specific image segments may be
isolated and grouped into regions using techniques such as
morphological opening or closing. Once a segment or region
is isolated, each pixel may be tagged by setting the state of
an associated tag bit in accordance with the image type (e.g.,
text, halftone, other). The tagged sections may then be
recombined into a single bit-map with tags. When passed on
to subsequent operations, the individual pixels within a
region are treated in a manner that is optimized for the
particular region. Subsequently, output image 16 may be
passed to marking engine 18 for exposure and development,
as is well-known, to produce output print 20.
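
The tagging and recombination described above might be sketched as follows. This Python fragment is an illustration only: the tag codes `TAG_OTHER`, `TAG_TEXT`, and `TAG_HALFTONE`, the function `recombine_with_tags`, and the choice to tag only ON pixels (rather than every pixel of a grouped region) are assumptions made for brevity and do not come from the specification.

```python
# Hypothetical tag codes; the specification does not assign numeric values.
TAG_OTHER, TAG_TEXT, TAG_HALFTONE = 0, 1, 2


def recombine_with_tags(text_plane, halftone_plane):
    """Merge two segmented binary images into one bitmap of (pixel, tag) pairs."""
    tagged = []
    for text_row, halftone_row in zip(text_plane, halftone_plane):
        row = []
        for text_bit, halftone_bit in zip(text_row, halftone_row):
            if text_bit:
                row.append((1, TAG_TEXT))
            elif halftone_bit:
                row.append((1, TAG_HALFTONE))
            else:
                row.append((0, TAG_OTHER))
        tagged.append(row)
    return tagged


# Two small segmented images of the same size, as a segmentation filter
# might produce them.
text_plane = [[1, 0, 0], [1, 0, 0]]
halftone_plane = [[0, 0, 1], [0, 1, 0]]
tagged_bitmap = recombine_with_tags(text_plane, halftone_plane)
```

A later stage could then select its processing per pixel by inspecting the tag, for example applying one rendering path where the tag is `TAG_TEXT` and another where it is `TAG_HALFTONE`.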

Referring now to FIG. 2, the data flow diagram illustrates
a two-stage image segmentation process that is carried out
within segmentation filter 14 of FIG. 1. Although depicted as
a preferable two-stage filter, it is understood that a
single-stage iteration could have been employed or that
additional stages or iterations could be used to further reduce
the error in classification of text and halftone regions of the
image. In the figure, input document 10, including regions of
both text 24 and halftone 26, is passed to a first iteration
filter circuit 30. Input document 10 is preferably a plurality of
binary data signals representing, for example, the text,
halftone and graphic image regions that make up the
document. The input document image may be produced as a
digitized representation of a hardcopy document scanned on
a scanner. In the first filter circuit, the document image is
filtered by comparing segments thereof with predefined
patterns, referred to herein as templates, stored as LUT1. As
is further illustrated in the flowchart of FIG. 3, and associated
examples of FIGS. 4A and 4B, once the input document is
obtained, step 100, a target pixel "X" is identified, and a set
of surrounding pixels are treated as a window, step 102.
FIG. 4A illustrates a portion of a digitized image 130 having
an upper portion 130a containing text, and a lower portion
130b containing a halftone region. Magnified portions 132
and 134, of image 130, are shown in FIG. 4B, where the
individual pixels comprising the image are discernible.

The values of the pixel signals within the window are then
compared to the templates stored in memory, for instance,
the templates 136a-136f and 138 illustrated in FIG. 4.
Preferably, the template filter is implemented as a look-up
table (LUT). When a matching template is found, step 106,
the target pixel is identified as a text pixel, step 120, and
allowed to pass unchanged through the filter, step 122.
Otherwise, the process continues at step 108 where the
presence of further templates is tested. If further templates
are available for comparison with the window, the process
continues at step 110. Otherwise, when no further templates
are available, the pixel is identified as being representative
of a halftone segment or background region of the input
image, step 112, and an output of "0" or an "OFF" pixel
signal is produced, step 114.
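
The serial process just described can be sketched in software roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the 3 x 3 window size, the treatment of out-of-image pixels as OFF, the single `VERTICAL_STROKE` template, and the function names `extract_window` and `template_filter` are all hypothetical. The look-up table is modeled as a Python set of window patterns.

```python
def extract_window(image, row, col, size=3):
    """Return the size x size window centered on (row, col) as a flat tuple.

    Pixels outside the image are treated as OFF (0); this border rule is an
    assumption of the sketch.
    """
    half = size // 2
    height, width = len(image), len(image[0])
    bits = []
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < height and 0 <= c < width
            bits.append(image[r][c] if inside else 0)
    return tuple(bits)


def template_filter(image, templates, size=3):
    """Pass the target pixel where its window matches a template, else output 0."""
    lut = set(templates)  # the look-up table: one entry per stored template
    output = [[0] * len(image[0]) for _ in image]
    for row in range(len(image)):
        for col in range(len(image[0])):
            if extract_window(image, row, col, size) in lut:
                output[row][col] = image[row][col]  # text pixel passes unchanged
    return output


# Hypothetical template: a vertical stroke through the window center.
VERTICAL_STROKE = (0, 1, 0,
                   0, 1, 0,
                   0, 1, 0)

image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
filtered = template_filter(image, [VERTICAL_STROKE])
```

With only this one template, the interior pixels of the stroke pass through while the isolated pixel at the right edge and the two stroke ends are set to OFF; a practical template set would also contain patterns for stroke ends, corners, and other text structures.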

In a preferred multiple iteration embodiment, following
the first iteration as described above, the filter attempts to
identify error pixels and further remove them from, say, the
text image. Because error pixels tend to be much sparser
than the identified text pixels, a different class of filters could
be used for successive iterations. For example, an
order-statistic filter could be used, where if less than a
predetermined number of pixels are active within a
neighborhood window, the target pixel will be considered an
error pixel. Alternatively, a similar neighborhood checking
could be performed with morphological filters.
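
A second-iteration filter of the order-statistic kind mentioned above might look like the following sketch. The function name `remove_sparse_pixels`, the 3 x 3 neighborhood, the threshold of three active pixels, and the decision to count the target pixel itself are all assumptions chosen for illustration.

```python
def remove_sparse_pixels(image, min_active=3, size=3):
    """Turn OFF any ON pixel whose neighborhood holds fewer than min_active ON pixels.

    The count includes the target pixel itself; the window is clipped at the
    image border.
    """
    half = size // 2
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            if not image[r][c]:
                continue
            active = sum(
                image[rr][cc]
                for rr in range(max(0, r - half), min(height, r + half + 1))
                for cc in range(max(0, c - half), min(width, c + half + 1))
            )
            if active < min_active:
                output[r][c] = 0  # treated as an error pixel
    return output


# Hypothetical first-iteration result: a dense 2 x 2 cluster of text pixels
# plus one isolated error pixel.
first_pass = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
second_pass = remove_sparse_pixels(first_pass)
```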

Note that the serial method described above with respect 
to the flowchart of FIG. 3, which is appropriate for software 
implementations of the LUT, may also be accomplished 
using electrical circuitry. Higher speed performance may
be obtained using a hardware implementation where the
LUTs would be implemented using an Application Specific
Integrated Circuit (ASIC) or Programmable Logic Array 
(PLA). 

As a simplified example of such an embodiment, the 
electrical circuit schematic of FIG. 5 is provided. In filter 
circuit 200, the document image is filtered by comparing 
segments thereof with predefined patterns, referred to herein 
as templates, stored as look-up tables (32 or 42). As is 
illustrated in FIG. 5, once the input document is obtained a 
target pixel X is identified, and a set of surrounding pixels 
are treated as a window, 202. The values of the pixel signals 
within the window are then transferred to a register 204, or 
similar memory location suitable for holding data signals 
representative of the pixels within window 202. 

Using a plurality of logic gates 206 (which may be a 
combination of AND or NOR gates depending upon binary 
value in any particular template position), or similar logic 
operations, the signals stored in register 204 are compared to 
the templates stored as a series of signals in LUT memory; 
where the set of stored signals for each template represents 
a unique pixel pattern to be detected within the window. For 
example, LUTs 32 or 42 would be populated with templates 
similar to those of 136^-136/ and 138 as illustrated in FIG. 
4. It is further noted that, while illustrated for simplicity as 
single elements 206, implementation of the parallel com- 
parison operation described would require a plurality of 
logic gates for each template or look-up table entry, as would 
be apparent to one skilled in the art. As is further apparent 
to one skilled in the art, logic minimization techniques may 
be employed to enable the rapid, parallel comparison of the 
LUT templates with the values stored in register 204. 

After the parallel comparison step, accomplished by com- 
paring the signals in register 204 with the associated signals 
in each entry of LUT 32, 42, any match between the signal 
sets would result in a positive logic signal being passed into 
the logical OR array, represented by reference numeral 208. 
As previously described with respect to the text structure 
template-based filter, the output of the OR gate array would 
identify target pixel X as a text pixel, where it may be 
allowed to pass unchanged. Otherwise, the target pixel could 
be identified as a non-text region and flagged to produce an 
output of "0" or an "OFF" pixel signal. 
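
The gate-level comparison of FIG. 5 might be modeled, purely for illustration, as shown below. The function names and the use of index 4 as the target pixel of a 3x3 window are assumptions; the model mirrors the description in that positions where a template holds "1" feed an AND gate, positions where it holds "0" feed a NOR gate, and the per-template results feed an OR array.

```python
def template_match_line(register, template):
    """Model one template's match line from AND and NOR gates over the register bits."""
    # AND gate: every register bit at a template "1" position must be 1.
    ones_ok = all(bit == 1 for bit, t in zip(register, template) if t == 1)
    # NOR gate: no register bit at a template "0" position may be 1.
    zeros_ok = not any(bit == 1 for bit, t in zip(register, template) if t == 0)
    return ones_ok and zeros_ok


def filter_output(register, templates, target_index=4):
    """Model the OR array: pass the target pixel if any template matches, else output 0."""
    matched = any(template_match_line(register, t) for t in templates)
    return register[target_index] if matched else 0


# Hypothetical template and register contents for a 3x3 window.
stroke = (0, 1, 0, 0, 1, 0, 0, 1, 0)
passed = filter_output((0, 1, 0, 0, 1, 0, 0, 1, 0), [stroke])
blocked = filter_output((0, 1, 0, 0, 1, 0, 0, 0, 0), [stroke])
```

In hardware all match lines would be evaluated at once; the sequential `any` call above only models the logical result, not the timing.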

Once processed in accordance with the steps of FIG. 3 or 
by the circuitry of FIG. 5, both being preferably imple- 
mented with a look-up table 32, the first filter output image 
34 results. Subsequently, a second iteration of a template- 
based filter 40 is accomplished in a manner similar to that 
previously described with respect to FIGS. 3, 4A-B and 5. 
More specifically, referring again to FIG. 2, the first filter 
output is then used as the input and compared against the 
templates stored in LUT2 (42), to generate the error image 
44. Subsequently, error image 44 is XORed (46) with the 
first output image to produce the text-only output image 48. 
Alternatively, the text-only output image 48 may be further 
XORed (50) with the input document to produce halftone- 
only output image 52. Thus, segmented binary images 48 
and 52 may be output, said segmented images primarily 
comprising marks representing one of the structures passed 
by the filters. For example, segmented images 48 and 52 are, 
respectively, images having only text or halftone image 
segments therein. 
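
The two XOR operations of FIG. 2 might be illustrated, under the assumption that each image is held as a list of rows of 0/1 values, as follows. The function names and the sample pixel values are hypothetical.

```python
def xor_images(a, b):
    """Pixel-wise XOR of two equally sized binary images."""
    return [[p ^ q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def combine_stages(input_image, first_output, error_image):
    """Derive the text-only and halftone-only images from the two filter stages."""
    text_only = xor_images(first_output, error_image)   # XOR 46: remove error pixels
    halftone_only = xor_images(text_only, input_image)  # XOR 50: what remains of the input
    return text_only, halftone_only


# Hypothetical one-row example: pixels 0-1 are text, pixels 3-4 are halftone.
document = [[1, 1, 0, 1, 1, 0]]
stage_one = [[1, 1, 0, 1, 0, 0]]   # first stage wrongly passed halftone pixel 3
stage_two = [[0, 0, 0, 1, 0, 0]]   # second stage identifies pixel 3 as an error
text_image, halftone_image = combine_stages(document, stage_one, stage_two)
```

Because the error image marks only pixels that are "ON" in the first output, the first XOR clears them; the second XOR then leaves exactly those input pixels that were not retained as text.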

It is further understood that the output of the filtering 
process, as depicted in FIGS. 3, 4A-B and 5, may also be a 
purely binary signal indicating whether a match was located 
for each particular target pixel considered. In this manner, 
the output of the template-based filters would be binary in 
nature, and would not necessarily allow the passage or 
masking of the image segments without performing further 
logic operations on the input image. 
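
As one possible illustration of such further logic operations, a binary match signal might be combined with the input image as sketched below. The function name and the `keep_matches` parameter are assumptions introduced for this example.

```python
def apply_match_mask(image, match_mask, keep_matches=True):
    """Combine a binary match mask with the input image by a pixel-wise logic operation."""
    output = []
    for image_row, mask_row in zip(image, match_mask):
        if keep_matches:
            # AND: pass only the pixels for which a template match was found.
            output.append([p & m for p, m in zip(image_row, mask_row)])
        else:
            # AND NOT: mask out the pixels for which a template match was found.
            output.append([p & (1 - m) for p, m in zip(image_row, mask_row)])
    return output


document = [[1, 1, 0, 1]]
matches = [[1, 0, 0, 1]]
kept = apply_match_mask(document, matches)
removed = apply_match_mask(document, matches, keep_matches=False)
```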

Referring again to FIG. 1, once the segmented images are 
produced by segmentation filter 14, they may be subse- 
quently operated on by an image processing and/or recom- 
bination operation, represented as block 15. In particular, the 
image processing operations may employ filters and other 
well-known techniques specifically designed to process the 
segmented image structures isolated by segmentation filter 
14. Furthermore, once processed, the segmented images 
may be recombined to form output image 16 in a form 
suitable for improved rendition by marking engine 18. 

Turning now to FIGS. 6 and 7, data flow diagrams 
illustrating the various stages in the process used to design 
the template-based segmentation filters of the present inven- 
tion will now be described. As illustrated in FIG. 6, LUT1 is 
produced by using a pair of training documents wherein the 
first training document 150 is a digital representation of an 
electronic document containing both text and halftone 
regions. The second training document, document 152, is 
identical to the first training document, except that it has 
been edited to remove the halftone regions therein. To 
produce the templates to be stored in LUT1, the first and 
second training documents are passed to template matching 
program 156. Program 156 works in accordance with the 
methods described by Loce et al. in "Non-Integer Image 
Resolution Conversion Using Statistically Generated Look- 
Up Tables," Ser. No. 08/170,082, filed Dec. 17, 1993, and by 
Loce et al. in "Method for Design and Implementation of an 
Image Resolution Enhancement System That Employs Sta- 
tistically Generated Look-Up Tables," Ser. No. 08/169,485, 
filed Dec. 17, 1993, which are hereby incorporated by refer- 
ence for their teachings. Generally, the filter design process 
accomplished by the template matching program will allow 
for design of optimized template-matching filters that are 
then stored in a programmable memory as LUT1. As is 
apparent, many aspects of the present invention or the 
associated template design process may be accomplished or 
simulated using a programmable data processing system. 
In the application cited above, the LUT design process 
produces a filter that results in a minimum number of pixels 
in error when applied to an input image. In the present case, 
it may be more important to not make errors in the halftone 
portion of the image, as opposed to the text portion of the 
image. Therefore in an alternative embodiment it may be 
preferable to apply a weighting factor, greater than 1, to 
halftone pixels in the training document, so the statistical 
design procedure attempts to minimize halftone pixel clas- 
sification error more than text pixel classification error. In 
general, it may be preferable to weight pixels of one 
structure in the training documents. The weighting could be 
straightforward; for example, each halftone pixel could be 
figured into the statistics as N pixels would normally be treated. 
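
A weighted design of this kind might be sketched as follows. This is a simplified stand-in written for illustration and is not the statistical procedure of the applications cited above; the function name, the cost bookkeeping, and the rule of keeping a window pattern only when its weighted halftone cost is lower than its text cost are all assumptions.

```python
from collections import defaultdict


def design_templates(mixed, text_only, radius=1, halftone_weight=2):
    """Choose window patterns to store as templates, counting each halftone pixel N times."""
    height, width = len(mixed), len(mixed[0])
    cost_if_kept = defaultdict(int)     # weighted halftone pixels that would leak through
    cost_if_dropped = defaultdict(int)  # text pixels that would be lost
    for row in range(height):
        for col in range(width):
            if not mixed[row][col]:
                continue  # an OFF pixel is output as 0 whether or not the pattern is kept
            window = tuple(
                mixed[r][c] if 0 <= r < height and 0 <= c < width else 0
                for r in range(row - radius, row + radius + 1)
                for c in range(col - radius, col + radius + 1)
            )
            if text_only[row][col]:
                cost_if_dropped[window] += 1
            else:
                cost_if_kept[window] += halftone_weight  # N pixels per halftone pixel
    return {w for w in cost_if_dropped if cost_if_kept[w] < cost_if_dropped[w]}


# Hypothetical one-row training pair; radius 0 makes the window a single pixel.
training_mixed = [[1, 1, 1]]
training_text = [[1, 1, 0]]
unweighted = design_templates(training_mixed, training_text, radius=0, halftone_weight=1)
weighted = design_templates(training_mixed, training_text, radius=0, halftone_weight=3)
```

In this small example the pattern is kept when halftone and text errors count equally, and rejected once each halftone pixel is counted as three pixels, which shows how the weight shifts the design toward protecting the halftone portion.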

Once the first stage or iteration of the segmentation filter 
is generated and stored as LUT1, the second stage may be 
produced. As illustrated by the data flow diagram of FIG. 6, 
the original image 150 is passed to the first-stage segmen- 
tation filter represented as LUT1. The output, filtered image 
154, is then stored so that it can be passed to XOR logic 
circuit 158, where it is XORed with the text-only training 
document 152. The output of the XOR operation 158, error 
image 160, is then passed to template matching program 156 
along with the filtered image 154. In this second occurrence 
of the template matching program, the output will be a series 
of templates depicted as LUT2. It should be noted that 
additional iterations of the second segmentation filter design 
process, FIG. 6, would be necessary to generate additional 
templates (LUTn) to accomplish further segmentation filter- 
ing. 
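
The data flow used to prepare the second design pass might be sketched as follows. The function name is hypothetical, and the first-stage filter is supplied by the caller as any function mapping an image to a filtered image; an identity function is used below only to keep the example small.

```python
def second_stage_training_pair(original, text_only, first_stage_filter):
    """Build the two images handed to the template matching program for the second stage."""
    filtered = first_stage_filter(original)  # corresponds to filtered image 154
    # XOR 158: pixels where the first-stage output disagrees with the text-only document.
    error = [
        [p ^ q for p, q in zip(filtered_row, text_row)]
        for filtered_row, text_row in zip(filtered, text_only)
    ]  # corresponds to error image 160
    return filtered, error


# Hypothetical one-row documents; the stand-in filter passes every pixel unchanged.
original_doc = [[1, 1, 0, 1]]
text_doc = [[1, 1, 0, 0]]
filtered_image, error_image = second_stage_training_pair(
    original_doc, text_doc, lambda image: image
)
```

The resulting pair plays the same role for the second stage that the two training documents play for the first: the filtered image is the observed input and the error image is the pattern the second-stage templates are designed to detect.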

In recapitulation, the present invention is a method and 
apparatus for automatic image segmentation using template 
matching filters. The invention generally segments differing 
binary textures or structures within an input image by 
passing one or more structures while removing other struc- 
tures. More particularly, the method and apparatus segment 
a stored binary image using a template matching filter that 
is designed to pass therethrough, for example, text regions 
while removing halftone regions. 

It is, therefore, apparent that there has been provided, in 
accordance with the present invention, a method and appa- 
ratus for efficiently employing template based filters to 
accomplish image segmentation. While this invention has 
been described in conjunction with preferred embodiments 
thereof, it is evident that many alternatives, modifications, 
and variations will be apparent to those skilled in the art. 
Accordingly, it is intended to embrace all such alternatives, 
modifications and variations that fall within the spirit and 
broad scope of the appended claims. 
What is claimed is: 
1. A method performed in a digital processor for pro- 
cessing a document image to determine image types present 
therein, the steps comprising: 

receiving, from an image source, a document image 
having a plurality of pixels therein, each pixel repre- 
sented by a density signal, and storing at least a portion 
thereof representing a region of the document image in 
a data buffer; 

retrieving, from the data buffer, the density signals for the 
document image; and 

determining, using template matching filters, image types 
present in the region of the document image by apply- 
ing a first stage template matching filter to the density 
signals for the document image to produce a first 
filtered output image, said step of applying a first stage 
template matching filter to the density signals for the 
document image comprising 
identifying a window within the document image so as 
to select a subset of the document image density 
signals and comparing the density signals in the 
window to a pattern within the template based filter, 
the pattern representing a pattern determined to 
occur in a textual segment of an image wherein a 
plurality of signals representing a pattern within the 
template-based filter are stored in each location of a 
look-up table memory and the step of comparing the 
density signals in the window to a pattern within the 
template-based filter comprises 

storing the subset of document image density signals in 
a register memory, 

logically comparing each density signal in the register 
memory with a uniquely associated signal of the 
template based filter stored in the look-up table 
memory location, and 

outputting a logic signal indicative of the result 
obtained in the logical comparing step 

applying a second stage template matching filter to the 
first filtered output image to produce an error image, 
and XORing the error image and the first filtered 
output image to mask from the first filtered output 
image any density signals from segments of the 
document image not comprised of text to produce a 
first output image, wherein the first output image 
contains only textual segments therein. 

2. An apparatus for processing binary image pixels in an 
image represented by a plurality of rasters of binary image 
pixels, each representing the binary state of a single pixel 
within the image, to identify regions exhibiting a particular, 
unique binary pixel structure therein, comprising: 

an image source for producing a document image having 
a plurality of binary image pixels therein, each pixel 
represented by a binary density signal; 

memory for storing at least a portion of the binary density 
signals representing a region of the document image in 
a data buffer; and 

a segmentation circuit employing template-matching fil- 
ters to identify the presence of the particular, unique 
binary pixel structure in the region of the image stored 
in said memory, the segmentation circuit further com- 
prising a logic filter for removing the particular, unique 
binary pixel structure from the region of the image 
stored in said memory to produce an output image 
substantially void of the particular, unique binary pixel 
structure. 

3. The apparatus of claim 2, wherein said segmentation 
circuit includes a multiple-stage template-based filter. 

4. The apparatus of claim 3, wherein a first stage of said 
multiple-stage filter includes a look-up table prepro- 
grammed with a plurality of entries, each of said entries 
corresponding to a pixel pattern determined to represent a 
segment of a training image having only textual content. 

5. The apparatus of claim 4, wherein a second stage of 
said multiple-stage filter includes a look-up table prepro- 
grammed with a plurality of entries, each of said entries 
corresponding to a pixel pattern determined to represent a 
segment of a training image having only textual content. 

6. The apparatus of claim 2, wherein the particular, unique 
binary pixel structure is a halftone. 

7. The apparatus of claim 2, wherein the particular, unique 
binary pixel structure is text. 